Add durable conversation resumability to the Antigravity Interactions harness #180
zbl94 wants to merge 3 commits into
… harness

Introduce a small durable key-value store and use it to persist each conversation's interaction-chain cursor (the last interaction id), so a conversation can resume after a process restart instead of starting a new chain.

- internal/storage: a minimal Store interface (Get/Put/Delete + ErrNotFound) with documented semantics (atomicity, read-after-write, not-found-vs-error, durability) and a single-writer concurrency model. Includes a filesystem implementation (FileStore) that writes atomically via temp-file + rename.
- harness: add AntigravityInteractionsConfig.StateStore; Start loads any persisted cursor and Run persists it after each successful turn, so a fresh Execution for the same conversation continues the existing interaction chain.
- harness: document the single-writer-per-conversation expectation on the Harness interface (the controller guarantees it), which is what makes the last-write-wins store correct.

Also includes related harness improvements:

- Retry HTTP 429 (rate limit) with exponential backoff + jitter, honoring Retry-After; only 429 is retried since it is rejected before any interaction is created.
- Terminology cleanup: the within-Run FC/FR loop is the "interaction loop" (continuation turns chained via previous_interaction_id), distinct from an AX-level resume.
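The 429 retry policy described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the names `shouldRetry`, `parseRetryAfter`, and `backoffDelay`, the base and cap values, and the choice of jitter range are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"math/rand"
	"net/http"
	"strconv"
	"time"
)

// shouldRetry mirrors the stated policy: only HTTP 429 is retried.
func shouldRetry(status int) bool {
	return status == http.StatusTooManyRequests
}

// parseRetryAfter interprets a Retry-After header value, which may be
// either delta-seconds or an HTTP-date. It reports false when the value
// is empty or unparseable.
func parseRetryAfter(v string) (time.Duration, bool) {
	if v == "" {
		return 0, false
	}
	if secs, err := strconv.Atoi(v); err == nil && secs >= 0 {
		return time.Duration(secs) * time.Second, true
	}
	if t, err := http.ParseTime(v); err == nil {
		if d := time.Until(t); d > 0 {
			return d, true
		}
		return 0, true // date already passed: retry immediately
	}
	return 0, false
}

// backoffDelay returns the wait before retry number attempt (0-based).
// A valid Retry-After value takes precedence (and is not capped here, an
// assumption of this sketch); otherwise the delay doubles per attempt up
// to max, with jitter drawn uniformly from [d/2, d].
func backoffDelay(attempt int, base, max time.Duration, retryAfter string) time.Duration {
	if d, ok := parseRetryAfter(retryAfter); ok {
		return d
	}
	d := base
	for i := 0; i < attempt && d < max; i++ {
		d *= 2
	}
	if d > max {
		d = max
	}
	half := d / 2
	return half + time.Duration(rand.Int63n(int64(half)+1))
}

func main() {
	for attempt := 0; attempt < 5; attempt++ {
		fmt.Println(attempt, backoffDelay(attempt, 100*time.Millisecond, 2*time.Second, ""))
	}
	fmt.Println("with Retry-After:", backoffDelay(0, 100*time.Millisecond, 2*time.Second, "3"))
}
```

Restricting retries to 429 keeps the loop safe to repeat, since per the commit message such a request is rejected before any interaction is created.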
Resolve conflict in internal/harness/antigravityinteractions.go: main changed Harness.Start to take a harnessConfig []byte argument (#193) while this branch added StateStore-backed resume-cursor loading to Start. Combined both: Start now accepts harnessConfig, stores it on the execution, and still loads any persisted resume cursor when a StateStore is configured. Build, vet, and full test suite pass.
…ness

Address PR review feedback: instead of a repo-level internal/storage key-value abstraction, keep resume-cursor persistence local to the Antigravity Interactions harness.

- Remove the internal/storage package (Store interface + FileStore).
- Add a harness-local, file-based cursorStore (cursorstore.go) with load/save of the per-conversation resumeCursor. resumeCursor stays a struct so it can grow (e.g. partial function-call results for mid-tool-loop recovery) later.
- Replace Config.StateStore with Config.StateDir. StateDir is now required: NewAntigravityInteractionsHarness returns an error if it is empty, and the constructor now returns (*Harness, error).
- Add tests with a fake Interactions API (an http.RoundTripper) that records each request and returns a canned SSE stream, covering the resume-across-restart CUJ (a fresh harness over the same StateDir sends the persisted previous_interaction_id), same-harness resume, the required-StateDir error, and a cursorStore load/save round-trip.
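For orientation, a harness-local, file-based cursor store with a load/save round-trip might look like the sketch below. It is not the code from cursorstore.go: the `LastInteractionID` field, the JSON encoding, the hex-encoded file name, and the temp-file + rename write (borrowed from the earlier FileStore description) are all assumptions for illustration.

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// resumeCursor is kept as a struct so it can grow later.
// The field name is hypothetical.
type resumeCursor struct {
	LastInteractionID string `json:"last_interaction_id"`
}

// cursorStore persists one cursor file per conversation under dir.
type cursorStore struct{ dir string }

// path hex-encodes the conversation id so arbitrary ids are safe as file names.
func (s cursorStore) path(conversationID string) string {
	return filepath.Join(s.dir, hex.EncodeToString([]byte(conversationID))+".json")
}

// load returns the persisted cursor; found is false if none was saved.
func (s cursorStore) load(conversationID string) (c resumeCursor, found bool, err error) {
	data, err := os.ReadFile(s.path(conversationID))
	if errors.Is(err, fs.ErrNotExist) {
		return resumeCursor{}, false, nil
	}
	if err != nil {
		return resumeCursor{}, false, err
	}
	if err := json.Unmarshal(data, &c); err != nil {
		return resumeCursor{}, false, err
	}
	return c, true, nil
}

// save writes the cursor to a temp file in the same directory, syncs it,
// and renames it over the final path so readers never see a partial file.
func (s cursorStore) save(conversationID string, c resumeCursor) error {
	data, err := json.Marshal(c)
	if err != nil {
		return err
	}
	if err := os.MkdirAll(s.dir, 0o700); err != nil {
		return err
	}
	tmp, err := os.CreateTemp(s.dir, "cursor-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly once the rename has succeeded
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), s.path(conversationID))
}

func main() {
	dir, err := os.MkdirTemp("", "statedir-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	if err := (cursorStore{dir: dir}).save("conv-1", resumeCursor{LastInteractionID: "int-42"}); err != nil {
		panic(err)
	}
	// A fresh store over the same directory stands in for a process restart.
	c, found, err := cursorStore{dir: dir}.load("conv-1")
	fmt.Println(c.LastInteractionID, found, err)
}
```

Because the store holds no in-memory state, a second `cursorStore` over the same directory sees whatever the first one saved, which is the property the resume-across-restart test relies on.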
// gap (every resume point), which is the only place the harness can influence an
// otherwise atomic interaction.
// Queue carries human input only -- the initial prompt and "steering" messages
// injected mid-run. It never carries tool results (the harness produces those
Shall we keep the original comment? IIUC, the `and "steering" messages injected mid-run` part is not supported yet.
// NewAntigravityInteractionsHarness returns an error if it is empty.
// Correctness relies on a single writer per conversation (the controller
// guarantees at most one Execution per conversation), so writes are
// last-write-wins.
The controller doesn't enforce this yet. Can we phrase this as an expectation/requirement of the caller instead of "the controller guarantees"?
@@ -0,0 +1,119 @@
// Copyright 2026 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
nit: we could rename this file to antigravityinteractions_cursorstore.go. No strong opinion on it; we'll likely need a folder re-org as a follow-up anyway.
Not blocking for this PR, just one thing to think about for the future: today we persist the cursor after every Interactions API turn, including turns that return function calls. It's fine to scope resumability to completed turns / next user input only. If we later want to recover from a crash in the middle of a tool loop, we will probably need to either persist only at quiescent boundaries or store enough pending tool state.
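The "persist only at quiescent boundaries" option could be sketched as below. Everything here is hypothetical (`turnResult`, `PendingCalls`, `advanceCursor`); it only illustrates the idea of holding the persisted cursor still while function calls are outstanding, and is not a proposal for the harness's actual types.

```go
package main

import "fmt"

// turnResult is a hypothetical summary of one Interactions API turn.
type turnResult struct {
	InteractionID string
	PendingCalls  []string // function calls the model is still waiting on
}

// quiescent reports whether the turn ended with no outstanding function
// calls, i.e. the next input is expected to come from the user.
func quiescent(t turnResult) bool {
	return len(t.PendingCalls) == 0
}

// advanceCursor returns the interaction id that should be on disk after
// turn t. The cursor only moves at quiescent boundaries; mid-tool-loop
// turns leave the previously persisted value untouched.
func advanceCursor(persisted string, t turnResult) string {
	if quiescent(t) {
		return t.InteractionID
	}
	return persisted
}

func main() {
	cursor := ""
	turns := []turnResult{
		{InteractionID: "int-1", PendingCalls: []string{"read_file"}},
		{InteractionID: "int-2"},
	}
	for _, t := range turns {
		cursor = advanceCursor(cursor, t)
		fmt.Printf("after %s: persisted cursor = %q\n", t.InteractionID, cursor)
	}
}
```

The trade-off is that a crash mid-loop resumes from the last quiescent point, so work done since then would have to be replayed; storing pending tool state instead would avoid that at the cost of a larger resumeCursor.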